Dataset statistics
| Number of variables | 12 |
|---|---|
| Number of observations | 124494 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 1 |
| Duplicate rows (%) | < 0.1% |
| Total size in memory | 11.4 MiB |
| Average record size in memory | 96.0 B |
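The figures in this table can be approximated directly with pandas. The sketch below uses a small made-up frame (`df`, with only four of the twelve columns) rather than the profiled dataset. How the report itself measures memory is not stated here, so `memory_usage()` is only an assumption about a comparable measure.

```python
import pandas as pd

# Toy frame standing in for the profiled dataset (values are illustrative only).
df = pd.DataFrame(
    {
        "date": ["2015-01-01", "2015-01-01", "2015-01-02", "2015-01-02"],
        "device": ["S1F01085", "S1F0166B", "S1F01085", "S1F01085"],
        "failure": [0, 0, 0, 0],
        "attribute1": [10, 20, 30, 30],
    }
)

n_obs, n_vars = df.shape                          # observations, variables
missing_cells = int(df.isna().sum().sum())        # NaN cells over the whole frame
duplicate_rows = int(df.duplicated().sum())       # rows repeating an earlier row
memory_bytes = int(df.memory_usage().sum())       # index plus column buffers
avg_record_bytes = memory_bytes / n_obs           # average record size

print(n_vars, n_obs, missing_cells, duplicate_rows, memory_bytes, avg_record_bytes)
```

On the real data the same ratio links the last two rows of the table: 11.4 MiB spread over 124494 observations is about 96 bytes per record.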
Variable types
| Categorical | 3 |
|---|---|
| Numeric | 9 |
Dataset has 1 (< 0.1%) duplicate rows | Duplicates |
date has a high cardinality: 304 distinct values | High cardinality |
device has a high cardinality: 1169 distinct values | High cardinality |
attribute7 is highly correlated with attribute8 | High correlation |
attribute8 is highly correlated with attribute7 | High correlation |
attribute3 is highly correlated with attribute9 | High correlation |
attribute9 is highly correlated with attribute3 | High correlation |
attribute3 is highly correlated with attribute4 and 1 other fields | High correlation |
attribute4 is highly correlated with attribute3 | High correlation |
attribute2 is highly skewed (γ1 = 23.8579234) | Skewed |
attribute3 is highly skewed (γ1 = 82.712278) | Skewed |
attribute4 is highly skewed (γ1 = 41.50261118) | Skewed |
attribute7 is highly skewed (γ1 = 73.47645637) | Skewed |
attribute8 is highly skewed (γ1 = 73.47645637) | Skewed |
attribute9 is highly skewed (γ1 = 49.89927809) | Skewed |
attribute2 has 118110 (94.9%) zeros | Zeros |
attribute3 has 115359 (92.7%) zeros | Zeros |
attribute4 has 115156 (92.5%) zeros | Zeros |
attribute7 has 123036 (98.8%) zeros | Zeros |
attribute8 has 123036 (98.8%) zeros | Zeros |
attribute9 has 97358 (78.2%) zeros | Zeros |
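The Skewed and Zeros alerts rest on two simple quantities: the share of exact zeros and the skewness coefficient γ1. The sketch below computes both in plain Python on a made-up sample. It uses the moment-based (population) definition of skewness, which is an assumption: the report's estimator may apply a bias correction and give slightly different numbers.

```python
def zero_share(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


def skewness(values):
    """Moment-based skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Illustrative sample: mostly zeros plus a few positive values and one outlier.
sample = [0] * 95 + [8, 16, 24, 40, 2344]

print(zero_share(sample))          # share of zeros in the sample
print(skewness(sample))            # strongly positive for a long right tail
print(skewness([1, 2, 3, 4, 5]))   # symmetric data has skewness 0
```

A column that is almost entirely zero with a few large values, like the attributes flagged above, produces a large positive γ1 for the same reason as this toy sample.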
Reproduction
| Analysis started | 2021-10-22 15:15:26.930074 |
|---|---|
| Analysis finished | 2021-10-22 15:15:43.474908 |
| Duration | 16.54 seconds |
| Software version | pandas-profiling v3.1.0 |
| Download configuration | config.json |
date
Categorical
| Distinct | 304 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 972.7 KiB |
| 2015-01-02 | 1163 |
|---|---|
| 2015-01-01 | 1163 |
| 2015-01-03 | 1163 |
| 2015-01-04 | 1162 |
| 2015-01-05 | 1161 |
| Other values (299) | 118682 |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 ? |
| Distinct scripts | 0 ? |
| Distinct blocks | 0 ? |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2015-01-01 |
|---|---|
| 2nd row | 2015-01-01 |
| 3rd row | 2015-01-01 |
| 4th row | 2015-01-01 |
| 5th row | 2015-01-01 |
Common Values
| Value | Count | Frequency (%) |
| 2015-01-02 | 1163 | 0.9% |
| 2015-01-01 | 1163 | 0.9% |
| 2015-01-03 | 1163 | 0.9% |
| 2015-01-04 | 1162 | 0.9% |
| 2015-01-05 | 1161 | 0.9% |
| 2015-01-06 | 1054 | 0.8% |
| 2015-01-07 | 798 | 0.6% |
| 2015-01-09 | 756 | 0.6% |
| 2015-01-08 | 756 | 0.6% |
| 2015-01-12 | 755 | 0.6% |
| Other values (294) | 114563 | 92.0% |
Length
Histogram of lengths of the category
device
Categorical
| Distinct | 1169 |
|---|---|
| Distinct (%) | 0.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 972.7 KiB |
| Z1F0MA1S | 304 |
|---|---|
| S1F0GPXY | 304 |
| S1F0EGMT | 304 |
| W1F0JY02 | 304 |
| S1F0FP0C | 304 |
| Other values (1164) | 122974 |
Length
| Max length | 8 |
|---|---|
| Median length | 8 |
| Mean length | 8 |
| Min length | 8 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 ? |
| Distinct scripts | 0 ? |
| Distinct blocks | 0 ? |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 1 ? |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | S1F01085 |
|---|---|
| 2nd row | S1F0166B |
| 3rd row | S1F01E6Y |
| 4th row | S1F01JE0 |
| 5th row | S1F01R2B |
Common Values
| Value | Count | Frequency (%) |
| Z1F0MA1S | 304 | 0.2% |
| S1F0GPXY | 304 | 0.2% |
| S1F0EGMT | 304 | 0.2% |
| W1F0JY02 | 304 | 0.2% |
| S1F0FP0C | 304 | 0.2% |
| Z1F0KJDS | 304 | 0.2% |
| Z1F0QL3N | 304 | 0.2% |
| S1F0H6JG | 304 | 0.2% |
| Z1F0Q8RT | 304 | 0.2% |
| Z1F0QK05 | 304 | 0.2% |
| Other values (1159) | 121454 | 97.6% |
Length
Histogram of lengths of the category
| Value | Count | Frequency (%) |
| z1f0ma1s | 304 | 0.2% |
| s1f0ggpp | 304 | 0.2% |
| w1f0jxdl | 304 | 0.2% |
| w1f05x69 | 304 | 0.2% |
| s1f0gced | 304 | 0.2% |
| w1f0fy92 | 304 | 0.2% |
| z1f0kkn4 | 304 | 0.2% |
| w1f0fzpa | 304 | 0.2% |
| s1f0e9ep | 304 | 0.2% |
| s1f0fgbq | 304 | 0.2% |
| Other values (1159) | 121454 | 97.6% |
failure
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 972.7 KiB |
| 0 | 124388 |
|---|---|
| 1 | 106 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 ? |
| Distinct scripts | 0 ? |
| Distinct blocks | 0 ? |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 124388 | 99.9% |
| 1 | 106 | 0.1% |
Length
Histogram of lengths of the category
Pie chart
| Value | Count | Frequency (%) |
| 0 | 124388 | 99.9% |
| 1 | 106 | 0.1% |
attribute1
Real number (ℝ≥0)
| Distinct | 123877 |
|---|---|
| Distinct (%) | 99.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 122388103.2 |
| Minimum | 0 |
|---|---|
| Maximum | 244140480 |
| Zeros | 11 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 972.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 12090104.4 |
| Q1 | 61284762 |
| median | 122797388 |
| Q3 | 183309640 |
| 95-th percentile | 231873846.4 |
| Maximum | 244140480 |
| Range | 244140480 |
| Interquartile range (IQR) | 122024878 |
Descriptive statistics
| Standard deviation | 70459334.22 |
|---|---|
| Coefficient of variation (CV) | 0.5757041113 |
| Kurtosis | -1.199305658 |
| Mean | 122388103.2 |
| Median Absolute Deviation (MAD) | 61032236 |
| Skewness | -0.01114296352 |
| Sum | 1.523658453 × 10¹³ |
| Variance | 4.964517778 × 10¹⁵ |
| Monotonicity | Not monotonic |
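Several rows in the quantile and descriptive tables are derived from others, which gives a quick consistency check. The sketch below redoes that arithmetic for attribute1 using the rounded values printed above, so small differences in the last digits are expected. The variable names are illustrative.

```python
# Values copied from the attribute1 tables (rounded as printed in the report).
minimum, maximum = 0, 244140480
q1, q3 = 61284762, 183309640
mean = 122388103.2
std = 70459334.22

value_range = maximum - minimum   # Range row
iqr = q3 - q1                     # Interquartile range (IQR) row
cv = std / mean                   # Coefficient of variation (CV) row
variance = std ** 2               # Variance row, roughly 4.96e15

print(value_range, iqr, round(cv, 6), f"{variance:.6e}")
```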
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 89196552 | 26 | < 0.1% |
| 165048912 | 26 | < 0.1% |
| 57192360 | 26 | < 0.1% |
| 169490248 | 23 | < 0.1% |
| 57180136 | 15 | < 0.1% |
| 169467344 | 15 | < 0.1% |
| 12194976 | 15 | < 0.1% |
| 165040624 | 15 | < 0.1% |
| 89162648 | 15 | < 0.1% |
| 165045144 | 13 | < 0.1% |
| Other values (123867) | 124305 |
| Value | Count | Frequency (%) |
| 0 | 11 | < 0.1% |
| 2048 | 1 | < 0.1% |
| 2056 | 2 | < 0.1% |
| 2168 | 1 | < 0.1% |
| 3784 | 1 | < 0.1% |
| 4224 | 1 | < 0.1% |
| 4480 | 1 | < 0.1% |
| 4560 | 1 | < 0.1% |
| 8280 | 1 | < 0.1% |
| 8616 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 244140480 | 1 | < 0.1% |
| 244138600 | 1 | < 0.1% |
| 244136552 | 1 | < 0.1% |
| 244135688 | 1 | < 0.1% |
| 244133240 | 1 | < 0.1% |
| 244132936 | 1 | < 0.1% |
| 244132752 | 1 | < 0.1% |
| 244131712 | 1 | < 0.1% |
| 244129416 | 1 | < 0.1% |
| 244127840 | 1 | < 0.1% |
attribute2
Real number (ℝ≥0)
| Distinct | 558 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 159.4847623 |
| Minimum | 0 |
|---|---|
| Maximum | 64968 |
| Zeros | 118110 |
| Zeros (%) | 94.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 972.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 8 |
| Maximum | 64968 |
| Range | 64968 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 2179.65773 |
|---|---|
| Coefficient of variation (CV) | 13.66687136 |
| Kurtosis | 626.8205749 |
| Mean | 159.4847623 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 23.8579234 |
| Sum | 19854896 |
| Variance | 4750907.822 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0 | 118110 | 94.9% |
| 2344 | 281 | 0.2% |
| 8 | 260 | 0.2% |
| 24 | 254 | 0.2% |
| 40 | 201 | 0.2% |
| 4960 | 175 | 0.1% |
| 424 | 169 | 0.1% |
| 16 | 166 | 0.1% |
| 88 | 152 | 0.1% |
| 552 | 140 | 0.1% |
| Other values (548) | 4586 | 3.7% |
| Value | Count | Frequency (%) |
| 0 | 118110 | 94.9% |
| 8 | 260 | 0.2% |
| 16 | 166 | 0.1% |
| 24 | 254 | 0.2% |
| 32 | 132 | 0.1% |
| 40 | 201 | 0.2% |
| 48 | 90 | 0.1% |
| 56 | 104 | 0.1% |
| 64 | 26 | < 0.1% |
| 72 | 35 | < 0.1% |
| Value | Count | Frequency (%) |
| 64968 | 1 | < 0.1% |
| 64792 | 7 | < 0.1% |
| 64784 | 11 | < 0.1% |
| 64776 | 8 | < 0.1% |
| 64736 | 13 | < 0.1% |
| 64728 | 13 | < 0.1% |
| 64584 | 17 | < 0.1% |
| 64472 | 1 | < 0.1% |
| 64464 | 1 | < 0.1% |
| 62296 | 1 | < 0.1% |
attribute3
Real number (ℝ≥0)
| Distinct | 47 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 9.940454962 |
| Minimum | 0 |
|---|---|
| Maximum | 24929 |
| Zeros | 115359 |
| Zeros (%) | 92.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 972.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 1 |
| Maximum | 24929 |
| Range | 24929 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 185.7473207 |
|---|---|
| Coefficient of variation (CV) | 18.68599791 |
| Kurtosis | 10473.5883 |
| Mean | 9.940454962 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 82.712278 |
| Sum | 1237527 |
| Variance | 34502.06713 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=47)
| Value | Count | Frequency (%) |
| 0 | 115359 | 92.7% |
| 1 | 3274 | 2.6% |
| 2 | 749 | 0.6% |
| 7 | 298 | 0.2% |
| 34 | 293 | 0.2% |
| 5 | 278 | 0.2% |
| 21 | 269 | 0.2% |
| 4 | 268 | 0.2% |
| 9 | 262 | 0.2% |
| 8 | 251 | 0.2% |
| Other values (37) | 3193 | 2.6% |
| Value | Count | Frequency (%) |
| 0 | 115359 | 92.7% |
| 1 | 3274 | 2.6% |
| 2 | 749 | 0.6% |
| 3 | 113 | 0.1% |
| 4 | 268 | 0.2% |
| 5 | 278 | 0.2% |
| 7 | 298 | 0.2% |
| 8 | 251 | 0.2% |
| 9 | 262 | 0.2% |
| 10 | 241 | 0.2% |
| Value | Count | Frequency (%) |
| 24929 | 4 | < 0.1% |
| 2693 | 179 | 0.1% |
| 2112 | 6 | < 0.1% |
| 1331 | 240 | 0.2% |
| 1326 | 5 | < 0.1% |
| 1162 | 1 | < 0.1% |
| 406 | 84 | 0.1% |
| 382 | 5 | < 0.1% |
| 377 | 6 | < 0.1% |
| 323 | 6 | < 0.1% |
attribute4
Real number (ℝ≥0)
| Distinct | 115 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.741120054 |
| Minimum | 0 |
|---|---|
| Maximum | 1666 |
| Zeros | 115156 |
| Zeros (%) | 92.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 972.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 6 |
| Maximum | 1666 |
| Range | 1666 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 22.90850654 |
|---|---|
| Coefficient of variation (CV) | 13.15733886 |
| Kurtosis | 2467.96284 |
| Mean | 1.741120054 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 41.50261118 |
| Sum | 216759 |
| Variance | 524.7996719 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0 | 115156 | 92.5% |
| 6 | 3681 | 3.0% |
| 1 | 889 | 0.7% |
| 2 | 711 | 0.6% |
| 3 | 466 | 0.4% |
| 12 | 454 | 0.4% |
| 4 | 359 | 0.3% |
| 10 | 294 | 0.2% |
| 112 | 245 | 0.2% |
| 5 | 231 | 0.2% |
| Other values (105) | 2008 | 1.6% |
| Value | Count | Frequency (%) |
| 0 | 115156 | 92.5% |
| 1 | 889 | 0.7% |
| 2 | 711 | 0.6% |
| 3 | 466 | 0.4% |
| 4 | 359 | 0.3% |
| 5 | 231 | 0.2% |
| 6 | 3681 | 3.0% |
| 7 | 175 | 0.1% |
| 8 | 170 | 0.1% |
| 9 | 45 | < 0.1% |
| Value | Count | Frequency (%) |
| 1666 | 9 | < 0.1% |
| 1074 | 6 | < 0.1% |
| 1033 | 3 | < 0.1% |
| 841 | 1 | < 0.1% |
| 763 | 1 | < 0.1% |
| 533 | 1 | < 0.1% |
| 529 | 4 | < 0.1% |
| 521 | 6 | < 0.1% |
| 487 | 18 | < 0.1% |
| 486 | 15 | < 0.1% |
attribute5
Real number (ℝ≥0)
| Distinct | 60 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 14.22266937 |
| Minimum | 1 |
|---|---|
| Maximum | 98 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 972.7 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 6 |
| Q1 | 8 |
| median | 10 |
| Q3 | 12 |
| 95-th percentile | 58 |
| Maximum | 98 |
| Range | 97 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 15.943028 |
|---|---|
| Coefficient of variation (CV) | 1.120958913 |
| Kurtosis | 12.15213494 |
| Mean | 14.22266937 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 3.483679387 |
| Sum | 1770637 |
| Variance | 254.1801417 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 8 | 22145 | 17.8% |
| 9 | 13597 | 10.9% |
| 11 | 12792 | 10.3% |
| 10 | 11480 | 9.2% |
| 7 | 11271 | 9.1% |
| 12 | 9843 | 7.9% |
| 6 | 8542 | 6.9% |
| 13 | 6006 | 4.8% |
| 14 | 3517 | 2.8% |
| 5 | 3429 | 2.8% |
| Other values (50) | 21872 | 17.6% |
| Value | Count | Frequency (%) |
| 1 | 173 | 0.1% |
| 2 | 203 | 0.2% |
| 3 | 815 | 0.7% |
| 4 | 933 | 0.7% |
| 5 | 3429 | 2.8% |
| 6 | 8542 | 6.9% |
| 7 | 11271 | 9.1% |
| 8 | 22145 | 17.8% |
| 9 | 13597 | 10.9% |
| 10 | 11480 | 9.2% |
| Value | Count | Frequency (%) |
| 98 | 224 | 0.2% |
| 95 | 672 | 0.5% |
| 94 | 224 | 0.2% |
| 92 | 448 | 0.4% |
| 91 | 215 | 0.2% |
| 90 | 357 | 0.3% |
| 89 | 224 | 0.2% |
| 78 | 224 | 0.2% |
| 70 | 224 | 0.2% |
| 68 | 448 | 0.4% |
attribute6
Real number (ℝ≥0)
| Distinct | 44838 |
|---|---|
| Distinct (%) | 36.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 260172.6577 |
| Minimum | 8 |
|---|---|
| Maximum | 689161 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 972.7 KiB |
Quantile statistics
| Minimum | 8 |
|---|---|
| 5-th percentile | 46 |
| Q1 | 221452 |
| median | 249799.5 |
| Q3 | 310266 |
| 95-th percentile | 443047.8 |
| Maximum | 689161 |
| Range | 689153 |
| Interquartile range (IQR) | 88814 |
Descriptive statistics
| Standard deviation | 99151.07855 |
|---|---|
| Coefficient of variation (CV) | 0.3810972276 |
| Kurtosis | 1.907777201 |
| Mean | 260172.6577 |
| Median Absolute Deviation (MAD) | 35382.5 |
| Skewness | -0.3752846096 |
| Sum | 3.238993485 × 10¹⁰ |
| Variance | 9830936377 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 31 | 777 | 0.6% |
| 44 | 708 | 0.6% |
| 27 | 636 | 0.5% |
| 26 | 520 | 0.4% |
| 29 | 441 | 0.4% |
| 36 | 337 | 0.3% |
| 35 | 290 | 0.2% |
| 52 | 282 | 0.2% |
| 45 | 246 | 0.2% |
| 28 | 216 | 0.2% |
| Other values (44828) | 120041 | 96.4% |
| Value | Count | Frequency (%) |
| 8 | 19 | < 0.1% |
| 9 | 172 | 0.1% |
| 12 | 51 | < 0.1% |
| 18 | 36 | < 0.1% |
| 19 | 30 | < 0.1% |
| 20 | 6 | < 0.1% |
| 21 | 58 | < 0.1% |
| 23 | 71 | 0.1% |
| 24 | 123 | 0.1% |
| 25 | 184 | 0.1% |
| Value | Count | Frequency (%) |
| 689161 | 1 | < 0.1% |
| 689062 | 1 | < 0.1% |
| 689035 | 1 | < 0.1% |
| 688964 | 1 | < 0.1% |
| 688952 | 2 | < 0.1% |
| 687901 | 1 | < 0.1% |
| 687802 | 1 | < 0.1% |
| 687775 | 1 | < 0.1% |
| 687706 | 1 | < 0.1% |
| 687694 | 2 | < 0.1% |
attribute7
Real number (ℝ≥0)
HIGH CORRELATION, SKEWED, ZEROS
| Distinct | 28 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.292528154 |
| Minimum | 0 |
|---|---|
| Maximum | 832 |
| Zeros | 123036 |
| Zeros (%) | 98.8% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 972.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 832 |
| Range | 832 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 7.436923979 |
|---|---|
| Coefficient of variation (CV) | 25.42293409 |
| Kurtosis | 6876.273007 |
| Mean | 0.292528154 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 73.47645637 |
| Sum | 36418 |
| Variance | 55.30783827 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=28)
| Value | Count | Frequency (%) |
| 0 | 123036 | 98.8% |
| 8 | 793 | 0.6% |
| 16 | 397 | 0.3% |
| 24 | 65 | 0.1% |
| 48 | 36 | < 0.1% |
| 32 | 35 | < 0.1% |
| 128 | 23 | < 0.1% |
| 176 | 20 | < 0.1% |
| 40 | 20 | < 0.1% |
| 6 | 13 | < 0.1% |
| Other values (18) | 56 | < 0.1% |
| Value | Count | Frequency (%) |
| 0 | 123036 | 98.8% |
| 6 | 13 | < 0.1% |
| 8 | 793 | 0.6% |
| 16 | 397 | 0.3% |
| 22 | 2 | < 0.1% |
| 24 | 65 | 0.1% |
| 32 | 35 | < 0.1% |
| 40 | 20 | < 0.1% |
| 48 | 36 | < 0.1% |
| 56 | 6 | < 0.1% |
| Value | Count | Frequency (%) |
| 832 | 2 | < 0.1% |
| 744 | 1 | < 0.1% |
| 736 | 4 | < 0.1% |
| 496 | 1 | < 0.1% |
| 424 | 1 | < 0.1% |
| 312 | 5 | < 0.1% |
| 272 | 2 | < 0.1% |
| 240 | 1 | < 0.1% |
| 216 | 1 | < 0.1% |
| 176 | 20 | < 0.1% |
attribute8
Real number (ℝ≥0)
HIGH CORRELATION, SKEWED, ZEROS
| Distinct | 28 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.292528154 |
| Minimum | 0 |
|---|---|
| Maximum | 832 |
| Zeros | 123036 |
| Zeros (%) | 98.8% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 972.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 832 |
| Range | 832 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 7.436923979 |
|---|---|
| Coefficient of variation (CV) | 25.42293409 |
| Kurtosis | 6876.273007 |
| Mean | 0.292528154 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 73.47645637 |
| Sum | 36418 |
| Variance | 55.30783827 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=28)
| Value | Count | Frequency (%) |
| 0 | 123036 | 98.8% |
| 8 | 793 | 0.6% |
| 16 | 397 | 0.3% |
| 24 | 65 | 0.1% |
| 48 | 36 | < 0.1% |
| 32 | 35 | < 0.1% |
| 128 | 23 | < 0.1% |
| 176 | 20 | < 0.1% |
| 40 | 20 | < 0.1% |
| 6 | 13 | < 0.1% |
| Other values (18) | 56 | < 0.1% |
| Value | Count | Frequency (%) |
| 0 | 123036 | 98.8% |
| 6 | 13 | < 0.1% |
| 8 | 793 | 0.6% |
| 16 | 397 | 0.3% |
| 22 | 2 | < 0.1% |
| 24 | 65 | 0.1% |
| 32 | 35 | < 0.1% |
| 40 | 20 | < 0.1% |
| 48 | 36 | < 0.1% |
| 56 | 6 | < 0.1% |
| Value | Count | Frequency (%) |
| 832 | 2 | < 0.1% |
| 744 | 1 | < 0.1% |
| 736 | 4 | < 0.1% |
| 496 | 1 | < 0.1% |
| 424 | 1 | < 0.1% |
| 312 | 5 | < 0.1% |
| 272 | 2 | < 0.1% |
| 240 | 1 | < 0.1% |
| 216 | 1 | < 0.1% |
| 176 | 20 | < 0.1% |
attribute9
Real number (ℝ≥0)
| Distinct | 65 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 12.45152377 |
| Minimum | 0 |
|---|---|
| Maximum | 18701 |
| Zeros | 97358 |
| Zeros (%) | 78.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 972.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 11 |
| Maximum | 18701 |
| Range | 18701 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 191.425623 |
|---|---|
| Coefficient of variation (CV) | 15.37367045 |
| Kurtosis | 4050.190542 |
| Mean | 12.45152377 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 49.89927809 |
| Sum | 1550140 |
| Variance | 36643.76914 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0 | 97358 | 78.2% |
| 1 | 9436 | 7.6% |
| 2 | 3722 | 3.0% |
| 3 | 2327 | 1.9% |
| 4 | 1396 | 1.1% |
| 6 | 797 | 0.6% |
| 7 | 774 | 0.6% |
| 5 | 735 | 0.6% |
| 8 | 733 | 0.6% |
| 10 | 641 | 0.5% |
| Other values (55) | 6575 | 5.3% |
| Value | Count | Frequency (%) |
| 0 | 97358 | 78.2% |
| 1 | 9436 | 7.6% |
| 2 | 3722 | 3.0% |
| 3 | 2327 | 1.9% |
| 4 | 1396 | 1.1% |
| 5 | 735 | 0.6% |
| 6 | 797 | 0.6% |
| 7 | 774 | 0.6% |
| 8 | 733 | 0.6% |
| 9 | 335 | 0.3% |
| Value | Count | Frequency (%) |
| 18701 | 5 | < 0.1% |
| 10137 | 4 | < 0.1% |
| 7226 | 5 | < 0.1% |
| 2794 | 6 | < 0.1% |
| 2637 | 84 | 0.1% |
| 2522 | 179 | 0.1% |
| 2270 | 5 | < 0.1% |
| 2269 | 1 | < 0.1% |
| 1864 | 5 | < 0.1% |
| 1165 | 118 | 0.1% |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
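The rank-then-correlate recipe can be sketched in plain Python as follows. This is an illustrative implementation, not the report's own code: the helper names are made up, and tied values are given their average rank, which is a common convention assumed here.

```python
def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]   # monotonic but nonlinear in x

print(spearman(x, y))     # monotonic relationship, so rho is 1 up to rounding
print(pearson(x, y))      # noticeably below 1 because the relation is not linear
```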
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
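The invariance property is easy to check numerically. The sketch below computes r from the definition on made-up data and then again after shifting and rescaling each variable separately. Positive scale factors are assumed, since a negative factor flips the sign of r.

```python
def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

r = pearson(x, y)

# Change location and scale of each variable independently.
x_shifted = [3 * a + 7 for a in x]
y_shifted = [0.5 * b - 2 for b in y]
r_shifted = pearson(x_shifted, y_shifted)

print(r, r_shifted)   # both approximately 0.8
```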
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
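The pair-counting definition translates almost literally into code. The sketch below implements that plain form (often called tau-a) on made-up data. Tied pairs count as neither concordant nor discordant, and library implementations typically use a tie-corrected variant, so results can differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1      # both variables move in the same direction
        elif sign < 0:
            discordant += 1      # the variables move in opposite directions
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs


x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

tau = kendall_tau(x, y)
print(tau)                         # 8 concordant and 2 discordant pairs out of 10
print(kendall_tau(x, x[::-1]))     # fully reversed order
```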
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available for Phik.
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
First rows
| date | device | failure | attribute1 | attribute2 | attribute3 | attribute4 | attribute5 | attribute6 | attribute7 | attribute8 | attribute9 | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2015-01-01 | S1F01085 | 0 | 215630672 | 56 | 0 | 52 | 6 | 407438 | 0 | 0 | 7 |
| 1 | 2015-01-01 | S1F0166B | 0 | 61370680 | 0 | 3 | 0 | 6 | 403174 | 0 | 0 | 0 |
| 2 | 2015-01-01 | S1F01E6Y | 0 | 173295968 | 0 | 0 | 0 | 12 | 237394 | 0 | 0 | 0 |
| 3 | 2015-01-01 | S1F01JE0 | 0 | 79694024 | 0 | 0 | 0 | 6 | 410186 | 0 | 0 | 0 |
| 4 | 2015-01-01 | S1F01R2B | 0 | 135970480 | 0 | 0 | 0 | 15 | 313173 | 0 | 0 | 3 |
| 5 | 2015-01-01 | S1F01TD5 | 0 | 68837488 | 0 | 0 | 41 | 6 | 413535 | 0 | 0 | 1 |
| 6 | 2015-01-01 | S1F01XDJ | 0 | 227721632 | 0 | 0 | 0 | 8 | 402525 | 0 | 0 | 0 |
| 7 | 2015-01-01 | S1F023H2 | 0 | 141503600 | 0 | 0 | 1 | 19 | 494462 | 16 | 16 | 3 |
| 8 | 2015-01-01 | S1F02A0J | 0 | 8217840 | 0 | 1 | 0 | 14 | 311869 | 0 | 0 | 0 |
| 9 | 2015-01-01 | S1F02DZ2 | 0 | 116440096 | 0 | 323 | 9 | 9 | 407905 | 0 | 0 | 164 |
Last rows
| date | device | failure | attribute1 | attribute2 | attribute3 | attribute4 | attribute5 | attribute6 | attribute7 | attribute8 | attribute9 | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 124484 | 2015-11-02 | W1F0SJJ2 | 0 | 47525320 | 0 | 0 | 0 | 12 | 357421 | 0 | 0 | 0 |
| 124485 | 2015-11-02 | Z1F0GB8A | 0 | 92823192 | 0 | 0 | 0 | 9 | 357127 | 0 | 0 | 0 |
| 124486 | 2015-11-02 | Z1F0GE1M | 0 | 222878704 | 0 | 0 | 0 | 10 | 349826 | 0 | 0 | 0 |
| 124487 | 2015-11-02 | Z1F0KJDS | 0 | 79883648 | 0 | 0 | 0 | 11 | 358121 | 0 | 0 | 0 |
| 124488 | 2015-11-02 | Z1F0KKN4 | 0 | 218765712 | 0 | 0 | 0 | 9 | 353525 | 0 | 0 | 0 |
| 124489 | 2015-11-02 | Z1F0MA1S | 0 | 18310224 | 0 | 0 | 0 | 10 | 353705 | 8 | 8 | 0 |
| 124490 | 2015-11-02 | Z1F0Q8RT | 0 | 172556680 | 96 | 107 | 4 | 11 | 332792 | 0 | 0 | 13 |
| 124491 | 2015-11-02 | Z1F0QK05 | 0 | 19029120 | 4832 | 0 | 0 | 11 | 350410 | 0 | 0 | 0 |
| 124492 | 2015-11-02 | Z1F0QL3N | 0 | 226953408 | 0 | 0 | 0 | 12 | 358980 | 0 | 0 | 0 |
| 124493 | 2015-11-02 | Z1F0QLC1 | 0 | 17572840 | 0 | 0 | 0 | 10 | 351431 | 0 | 0 | 0 |
Most frequently occurring
| date | device | failure | attribute1 | attribute2 | attribute3 | attribute4 | attribute5 | attribute6 | attribute7 | attribute8 | attribute9 | # duplicates | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2015-07-10 | S1F0R4Q8 | 0 | 192721392 | 0 | 0 | 0 | 8 | 213700 | 0 | 0 | 0 | 2 |
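A table like this one can be built by counting identical rows. The sketch below does so with `collections.Counter` on a few shortened rows (four columns instead of twelve). The first two rows are modelled on the duplicate shown above and the third on the first sample row. This is an assumption about one reasonable approach, not the report's own implementation.

```python
from collections import Counter

# Shortened rows: (date, device, failure, attribute1).
rows = [
    ("2015-07-10", "S1F0R4Q8", 0, 192721392),
    ("2015-07-10", "S1F0R4Q8", 0, 192721392),
    ("2015-01-01", "S1F01085", 0, 215630672),
]

counts = Counter(rows)

# Keep only rows that occur more than once, with their number of occurrences.
duplicates = {row: n for row, n in counts.items() if n > 1}

# Rows that would be dropped by de-duplication (every occurrence after the first).
extra_rows = sum(n - 1 for n in duplicates.values())

print(duplicates)
print(extra_rows)
```

Here the repeated row occurs twice, which corresponds to one extra row, matching the single duplicate row reported in the dataset statistics.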